AITopics | data augmentation approach

Matching Ranks Over Probability Yields Truly Deep Safety Alignment

Vega, Jason, Singh, Gagandeep

arXiv.org Artificial Intelligence, Dec-8-2025

A frustratingly easy technique known as the prefilling attack has been shown to effectively circumvent the safety alignment of frontier LLMs by simply prefilling the assistant response with an affirmative prefix before decoding. In response, recent work proposed a supervised fine-tuning (SFT) defense using data augmentation to achieve a "deep" safety alignment, allowing the model to generate natural language refusals immediately following harmful prefills. Unfortunately, we show in this work that the "deep" safety alignment produced by such an approach is in fact not very deep. A generalization of the prefilling attack, which we refer to as the Rank-Assisted Prefilling (RAP) attack, can effectively extract harmful content from models fine-tuned with the data augmentation defense by selecting low-probability "harmful" tokens from the top 20 predicted next tokens at each step (thus ignoring high-probability "refusal" tokens). We argue that this vulnerability is enabled by the "gaming" of the SFT objective when the target distribution entropies are low, where low fine-tuning loss is achieved by shifting large probability mass to a small number of refusal tokens while neglecting the high ranks of harmful tokens. We then propose a new perspective on achieving deep safety alignment by matching the token ranks of the target distribution, rather than their probabilities. This perspective yields a surprisingly simple fix to the data augmentation defense based on regularizing the attention placed on harmful prefill tokens, an approach we call PRefill attEntion STOpping (PRESTO). Adding PRESTO yields up to a 4.7x improvement in the mean StrongREJECT score under RAP attacks across three popular open-source LLMs, with low impact on model utility.
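
The distinction this abstract draws between a token's probability and its rank can be illustrated with a toy calculation. The sketch below is not the authors' method or code: the four-token vocabulary, the two distributions, and the helper `token_ranks` are all hypothetical, chosen only to show that a token's probability can collapse while its rank barely moves.

```python
def token_ranks(probs):
    """Map each token to its 1-based rank (1 = most probable)."""
    ordered = sorted(probs, key=probs.get, reverse=True)
    return {token: position + 1 for position, token in enumerate(ordered)}


# Hypothetical next-token distributions over a toy vocabulary
# (illustrative numbers only, not taken from the paper).
before = {"Sure": 0.60, "I": 0.25, "Sorry": 0.10, "the": 0.05}
after = {"I": 0.97, "Sure": 0.02, "Sorry": 0.008, "the": 0.002}

ranks_before = token_ranks(before)
ranks_after = token_ranks(after)

# The probability of "Sure" falls from 0.60 to 0.02,
# yet its rank only moves from 1 to 2.
print(ranks_before["Sure"], ranks_after["Sure"])
```

In this toy case nearly all probability mass has moved to one token, but the ordering of the remaining tokens is unchanged, which is the kind of gap between probabilities and ranks that the abstract describes.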

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2512.05518

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)

A Bayesian Data Augmentation Approach for Learning Deep Models

Neural Information Processing Systems, Nov-21-2025, 14:13:21 GMT

Data augmentation is an essential part of the training process applied to deep learning models. The motivation is that a robust training process for deep learning models depends on large annotated datasets, which are expensive to acquire, store and process. Therefore, a reasonable alternative is to automatically generate new annotated training samples using a process known as data augmentation. The dominant data augmentation approach in the field assumes that new training samples can be obtained via random geometric or appearance transformations applied to annotated training samples, but this is a strong assumption because it is unclear if this is a reliable generative model for producing new training samples. In this paper, we provide a novel Bayesian formulation of data augmentation, where new annotated training points are treated as missing variables and generated based on the distribution learned from the training set. For learning, we introduce a theoretically sound algorithm, generalised Monte Carlo expectation maximisation, and demonstrate one possible implementation via an extension of the Generative Adversarial Network (GAN). Classification results on MNIST, CIFAR-10 and CIFAR-100 show the better performance of our proposed method compared to the current dominant data augmentation approach mentioned above; the results also show that our approach produces better classification results than similar GAN models.
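
For readers unfamiliar with the "dominant" approach this abstract contrasts itself with, a minimal sketch of random geometric transformations follows. It illustrates only that baseline, not the paper's Bayesian method. The helper names (`hflip`, `shift_right`, `augment`), the list-of-lists image format, and the flip probability are assumptions made for illustration.

```python
import random


def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def shift_right(image, k, fill=0):
    """Shift every row k pixels to the right, padding with `fill`."""
    return [[fill] * k + row[:len(row) - k] for row in image]


def augment(image, rng):
    """Apply a random flip and a random 0- or 1-pixel shift.

    The label of the sample is assumed to be unchanged by these
    transformations, which is the assumption the abstract questions.
    """
    out = hflip(image) if rng.random() < 0.5 else [row[:] for row in image]
    return shift_right(out, rng.randint(0, 1))


image = [[1, 2, 3],
         [4, 5, 6]]
new_sample = augment(image, random.Random(0))
```

Each call to `augment` produces another training sample with the same shape and (by assumption) the same label as the original.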

bayesian data augmentation approach, data augmentation approach, training sample, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Data Augmentation Techniques for Chinese Disease Name Normalization

Cui, Wenqian, Fu, Xiangling, Liu, Shaohui, Gu, Mingjun, Liu, Xien, Wu, Ji, King, Irwin

arXiv.org Artificial Intelligence, Jan-2-2025

Disease name normalization is an important task in the medical domain. It classifies disease names written in various formats into standardized names, serving as a fundamental component in smart healthcare systems for various disease-related functions. Nevertheless, the most significant obstacle to existing disease name normalization systems is the severe shortage of training data. Consequently, we present a novel data augmentation approach that includes a series of data augmentation techniques and some supporting modules to help mitigate the problem. Through extensive experimentation, we illustrate that our proposed approach exhibits significant performance improvements across various baseline models and training objectives, particularly in scenarios with limited training data.

axis word, data augmentation approach, disease name, (11 more...)

arXiv.org Artificial Intelligence

2501.01195

Country:

Asia > China > Beijing > Beijing (0.07)
Asia > China > Hong Kong (0.05)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Mixing Signals: Data Augmentation Approach for Deep Learning Based Modulation Recognition

Xu, Xinjie, Chen, Zhuangzhi, Xu, Dongwei, Zhou, Huaji, Yu, Shanqing, Zheng, Shilian, Xuan, Qi, Yang, Xiaoniu

arXiv.org Artificial Intelligence, Oct-29-2024

With the rapid development of deep learning, automatic modulation recognition (AMR), as an important task in cognitive radio, has gradually transformed from traditional feature extraction and classification to automatic classification by deep learning technology. However, deep learning models are data-driven methods, which often require a large amount of data as training support. Data augmentation, as a strategy for expanding the dataset, can improve the generalization of deep learning models and thus improve their accuracy to a certain extent. In this paper, for AMR of radio signals, we propose a data augmentation strategy based on mixing signals and consider four specific methods (Random Mixing, Maximum-Similarity-Mixing, $\theta$-Similarity Mixing and n-times Random Mixing) to achieve data augmentation. Experiments show that our proposed method can improve the classification accuracy of deep learning based AMR models on the full public dataset RML2016.10a. In particular, for the case of a single signal-to-noise ratio signal set, the classification accuracy can be significantly improved, which verifies the effectiveness of the methods.
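
The abstract names Random Mixing but does not give its formula, so the following is only a rough sketch of what a mixing-based augmentation could look like. It assumes that each signal is mixed with a randomly chosen signal of the same class via a weighted sum. The functions `mix_signals` and `random_mixing`, the `weight` parameter, and the same-class pairing rule are all hypothetical and may differ from the paper's definition.

```python
import random


def mix_signals(a, b, weight=0.5):
    """Weighted sample-wise sum of two equal-length signals (assumed rule)."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return [weight * x + (1 - weight) * y for x, y in zip(a, b)]


def random_mixing(dataset, rng, weight=0.5):
    """For each (signal, label) pair, mix with a random same-label signal.

    Pairing within the same class is an assumption made here so that
    the original label can be kept for the mixed signal.
    """
    by_label = {}
    for signal, label in dataset:
        by_label.setdefault(label, []).append(signal)
    augmented = []
    for signal, label in dataset:
        partner = rng.choice(by_label[label])
        augmented.append((mix_signals(signal, partner, weight), label))
    return augmented


dataset = [([1.0, 0.0], "BPSK"), ([0.0, 1.0], "BPSK"), ([2.0, 2.0], "QPSK")]
extra = random_mixing(dataset, random.Random(0))
```

The mixed samples in `extra` would be appended to the original training set; real signals in RML2016.10a are I/Q sequences, which this toy example flattens to short lists.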

data augmentation approach, mixing signal, modulation recognition, (1 more...)

arXiv.org Artificial Intelligence

2204.03737

Genre: Research Report (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-modal Intent Detection

Huang, Shijue, Qin, Libo, Wang, Bingbing, Tu, Geng, Xu, Ruifeng

arXiv.org Artificial Intelligence, Dec-31-2023

Multi-modal intent detection aims to utilize various modalities to understand the user's intentions, which is essential for the deployment of dialogue systems in real-world scenarios. The two core challenges for multi-modal intent detection are (1) how to effectively align and fuse different features of modalities and (2) the limited labeled multi-modal intent training data. In this work, we introduce a shallow-to-deep interaction framework with data augmentation (SDIF-DA) to address the above challenges. Firstly, SDIF-DA leverages a shallow-to-deep interaction module to progressively and effectively align and fuse features across text, video, and audio modalities. Secondly, we propose a ChatGPT-based data augmentation approach to automatically augment sufficient training data. Experimental results demonstrate that SDIF-DA can effectively align and fuse multi-modal features by achieving state-of-the-art performance. In addition, extensive analyses show that the introduced data augmentation approach can successfully distill knowledge from the large language model.

data augmentation, detection, modality, (14 more...)

arXiv.org Artificial Intelligence

2401.00424

Country:

Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition

Yang, Linyi, Yuan, Lifan, Cui, Leyang, Gao, Wenyang, Zhang, Yue

arXiv.org Artificial Intelligence, Dec-25-2023

Few-shot Named Entity Recognition (NER) is imperative for entity tagging in limited resource domains and has thus received due attention in recent years. Existing approaches for few-shot NER are evaluated mainly under in-domain settings. In contrast, little is known about how these inherently faithful models perform in cross-domain NER using a few labeled in-domain examples. This paper proposes a two-step rationale-centric data augmentation method to improve the model's generalization ability. Results on several datasets show that our model-agnostic method significantly improves the performance of cross-domain NER tasks compared to previous state-of-the-art methods, including the data augmentation and prompt-tuning methods. Our code is available at https://github.com/lifan-yuan/FactMix.

dataset, entity recognition, factmix, (9 more...)

arXiv.org Artificial Intelligence

2208.11464

Country:

Europe > United Kingdom (0.05)
Oceania > Australia (0.04)
North America > Dominican Republic (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Attention-stacked Generative Adversarial Network (AS-GAN)-empowered Sensor Data Augmentation for Online Monitoring of Manufacturing System

Li, Yuxuan, Liu, Chenang

arXiv.org Artificial Intelligence, Jun-9-2023

Machine learning (ML) has been extensively adopted for online sensing-based monitoring in advanced manufacturing systems. However, the sensor data collected under abnormal states are usually insufficient, leading to a significant data imbalance issue for supervised machine learning. A common solution for this issue is to incorporate a data augmentation technique, i.e., augmenting the available abnormal-state data (i.e., minority samples) via synthetic generation. To generate high-quality minority samples effectively, it is vital to learn the underlying distribution of the abnormal-state data. In recent years, generative adversarial network (GAN)-based approaches have become popular for learning the data distribution as well as performing data augmentation. However, in practice, the quality of generated samples from GAN-based data augmentation may vary drastically. In addition, the sensor signals are collected sequentially over time from the manufacturing systems, which means that accounting for sequential information is also very important in data augmentation. To address these limitations, inspired by the multi-head attention mechanism, this paper proposes an attention-stacked GAN (AS-GAN) architecture for sensor data augmentation in online monitoring of advanced manufacturing. In the proposed AS-GAN, a new attention-stacked framework is incorporated to strengthen the generator in the GAN with the capability to learn sequential information. Furthermore, the developed attention-stacked framework also greatly helps to improve the quality of generated sensor signals. The case studies conducted in additive manufacturing also successfully validate the effectiveness of AS-GAN in augmenting high-quality artificial multi-channel sensor signals for online monitoring of manufacturing systems.

artificial intelligence, machine learning, sensor signal, (13 more...)

arXiv.org Artificial Intelligence

2306.06268

Country: North America > United States > Oklahoma > Payne County > Stillwater (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Reference Matters: Benchmarking Factual Error Correction for Dialogue Summarization with Fine-grained Evaluation Framework

Gao, Mingqi, Wan, Xiaojun, Su, Jia, Wang, Zhefeng, Huai, Baoxing

arXiv.org Artificial Intelligence, Jun-8-2023

Factuality is important to dialogue summarization. Factual error correction (FEC) of model-generated summaries is one way to improve factuality. Current FEC evaluation that relies on factuality metrics is neither reliable nor detailed enough. To address this problem, we are the first to manually annotate an FEC dataset for dialogue summarization containing 4000 items and propose FERRANTI, a fine-grained evaluation framework based on reference correction that automatically evaluates the performance of FEC models on different error categories. Using this evaluation framework, we conduct sufficient experiments with FEC approaches under a variety of settings and find the best training modes and significant differences in the performance of the existing approaches on different factual error categories.

computational linguistic, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2306.05119

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > Dominican Republic (0.04)
(8 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Data Science > Data Quality > Data Cleaning (0.61)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)

Methods for addressing class imbalance in deep learning-based natural language processing

AIHub, Mar-30-2023, 08:43:54 GMT

Figure 1: Modern Transformer-based Natural Language Processing (NLP) methods still struggle with class imbalance: class-wise performance (second row, each dot represents one class) decreases with class frequency in training data (first row) for a variety of NLP tasks.

Natural Language Processing (NLP) tasks are often addressed by training supervised models using manually labeled datasets. This comes with the challenge that categories rarely occur with the exact same frequency; in practice, the distribution of samples across classes is usually highly skewed. In sentiment analysis, for example, there may be a large number of negative reviews, with only a small number of positive reviews. Such class imbalance in the training and evaluation datasets can pose a challenge for NLP models, which are more heavily influenced by majority class data during training.
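
One common way to counter the majority-class influence described above is to weight each class inversely to its frequency in the loss. The excerpt does not say which methods the article covers, so the sketch below is only a generic illustration. The helper `class_weights`, the `n / (k * count)` formula, and the toy label list are assumptions, not taken from the article.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count)
            for cls, count in counts.items()}


# Toy sentiment dataset mirroring the skew described in the text:
# many negative reviews, few positive ones.
labels = ["neg"] * 8 + ["pos"] * 2
weights = class_weights(labels)
print(weights)  # {'neg': 0.625, 'pos': 2.5}
```

With these weights, an error on a minority-class sample contributes four times as much to a weighted loss as an error on a majority-class sample.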

class imbalance, classification, minority class, (9 more...)

AIHub

Genre: Play > Prospect (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

UI Layers Merger: Merging UI layers via Visual Learning and Boundary Prior

Chen, Yun-nong, Zhen, Yan-kun, Shi, Chu-ning, Li, Jia-zhi, Chen, Liu-qing, Li, Ze-jian, Sun, Ling-yun, Zhou, Ting-ting, Chang, Yan-fang

arXiv.org Artificial Intelligence, Sep-3-2022

With the fast-growing GUI development workload in the Internet industry, some work on intelligent methods has attempted to generate maintainable front-end code from UI screenshots. It can be more suitable to utilize UI design drafts that contain UI metadata. However, fragmented layers inevitably appear in the UI design drafts, which greatly reduces the quality of code generation. None of the existing GUI automated techniques detects and merges the fragmented layers to improve the accessibility of generated code. In this paper, we propose UI Layers Merger (UILM), a vision-based method, which can automatically detect and merge fragmented layers into UI components. Our UILM contains a Merging Area Detector (MAD) and a layers merging algorithm. MAD incorporates the boundary prior knowledge to accurately detect the boundaries of UI components. Then, the layers merging algorithm can search out the associated layers within the components' boundaries and merge them into a whole part. We present a dynamic data augmentation approach to boost the performance of MAD. We also construct a large-scale UI dataset for training the MAD and testing the performance of UILM. The experiment shows that the proposed method outperforms the best baseline regarding merging area detection and achieves a decent accuracy regarding layers merging.

design draft, fragmented layer, ui component, (13 more...)

arXiv.org Artificial Intelligence

2206.13389

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(16 more...)

Genre: Research Report > New Finding (0.94)

Industry: Information Technology (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Graphics (0.89)
(2 more...)
